Chapter 1

State-Space Visualisation and
Fractal Properties of
Parrondo's Games

Andrew Allison
Charles Pearce
Derek Abbott

1.1 Introduction 

In Parrondo's games, the apparently paradoxical situation occurs where
individually losing games combine to win [1, 2]. The basic formulation and
definitions of Parrondo's games are described in Harmer et al. [3, 4, 5,
6]. These games have recently gained considerable attention as they are
physically motivated and have been related to physical systems such as the
Brownian ratchet [4], lattice gas automata [7] and spin systems [8]. Various
authors have pointed out interest in these games for areas as diverse as
biogenesis [9], political models [8], small-world networks [10], economics [8]
and population genetics [11].

In this chapter, we will first introduce the relevant properties of Markov 
transition operators and then introduce some terminology and visualisa- 
tion techniques from the theory of dynamical systems. We will then use 
these tools, later in the chapter, to define and investigate some interesting 
properties of Parrondo's games. 

We must first discuss and introduce the mathematical machinery, terms 
and notation that we will use. The key concepts are : 

state : This contains all of the information that we need to specify what 
is happening in the system at any given time. 

time-varying probability vector : This is a time-varying probability
distribution which specifies the probabilities that the system will be
in certain states at any given time.

transition matrix : This is a Markov operator which determines
the way in which the time-varying probability vector will evolve over
time.

These concepts are defined and discussed at length in many of the standard 
text books on stochastic processes [12, 13, 14, 15]. 

Time-homogeneous sequences of regular Markov transition operators have
unique stable limiting state-probabilities. The state-space representations
of the associated time-varying probability vectors converge to unique points.
If the sequence of Markov transition operators is not homogeneous in time
then the sequence of time-varying probability vectors generated by the
products of these different operators need not converge to a single point in the
original state space. It is possible to construct quite simple examples to
show that this is the case.

If the sequences are periodic then it is possible to incorporate the fi- 
nite memory of these systems into a new definition of "state." The new 
inhomogeneous systems can be re-defined as strictly homogeneous Markov 
processes. These new Markov processes, with new states, will generally have 
unique limiting probability vectors. 

If we allow the sequence to become indefinitely long then the amount of 
memory required grows without bound. It is still possible, in principle, to 
define these indefinitely long periodic sequences as homogeneous Markov
processes, although the definition, and encoding, of the states would require
some care. We can consider any one indefinite sequence of operators as
being one of many possible indefinite sequences of operators. If we do this 
then most of the possible sequences will appear to be "random." We can 
learn something about the general case by studying indefinitely long ran- 
dom sequences. 

If the sequence of operators is chosen at random then the time varying 
probability vector, as defined in the original state-space, does not generally 
converge to a single unique value. Simulations show that the time-varying 
probability vector assumes a distribution in the original state-space which 
is self-similar, or "fractal," in appearance. The existence of fractal geometry 
is established, with rigor, for some particular Markov games. We establish 
a transcendental equation which allows the calculation of the Hausdorff 
dimensions of these fractal objects. 

If state-transitions of the time-inhomogeneous Markov chains are associ- 
ated with rewards then it is possible to show that even simple, "two-state," 
Markov chains can generate a Parrondo effect, as long as we are free to 
choose the reward matrix. Homogeneous sequences of the individual games 
generate a net loss over time. Inhomogeneous mixtures of two games can 
generate a net gain. 

We show that the expected rates of return, or moments of the reward 
process, for the time-inhomogeneous games are identical to the expected 
rates of return from a homogeneous sequence of a time-averaged game. This
is a logical consequence of the Law of Total Probability and the definition
of expected value.

Two different views of the time-inhomogeneous process emerge, depending
on the viewpoint that one takes:

• If you have access to the history of the time-varying probability vector
and you have a memory to store this information and you choose to
represent this data in state-space then you will see distributions with
fractal geometry. This is more or less the view that a large casino
might have if they were to visualise the average states of their many
customers.

• If you do not have access to the time-varying probability vector or
you have no memory in which to store this information then all that
you can see is a sequence of rewards from a stochastic process. The 
internal details of this process are hidden from you. You have no 
way of knowing precisely how this process was constructed from an 
inhomogeneous sequence of Markov operators. There is no experiment 
that you can perform to distinguish between the time-inhomogeneous 
process and the time-averaged process. The time-averaged process is a 
homogeneous sequence of a single operator. We can calculate a single 
unique limiting value for the probability vector. This is more or less 
the view that a single, mathematically inclined, casino patron might 
have if they were playing against some elaborate poker machine. The 
internal workings of the machine would be hidden from the customer 
but it would be possible to perform some analysis of the outcomes 
and form an estimate of the parameters for the time-averaged model. 

We show that the time inhomogeneous process is consistent in the sense 
that the "casino" and the "customer" will always agree on the expected 
winnings or losses of the customer. In more technical terms, the time- 
average, which the customer sees, is the same as the ensemble-average over 
state-space, which the casino can calculate. 

1.2 Time-Homogeneous Markov Chains and 
Notation 

Finite discrete-time Markov chains can be represented in terms of matrices
of conditional transition probabilities. These matrices are called Markov
transition operators. We denote these by capital letters in brackets, e.g. [A],
where $A_{i,j} = \Pr\{K_{t+1} = j \mid K_t = i\}$ and $K \in \mathbb{Z}$ is some measure of
displacement or the "state" of the system. The Markov property requires that $A_{i,j}$
cannot be a function of K but it can be a function of time, t. In Parrondo's
original games, K represents the amount of capital that a player has. There
is a one-to-one mapping between Markov games and the Markov transition
operators for these games. We will refer to the games and the transition
operators interchangeably.

The probability that the system will be in any one state at a given
instant of time can be represented by a distribution called the time-varying
probability vector. We represent this probability mass function, at time t,
using a row vector, $V_t$. We can represent the evolution of the Markov chain
in time using a simple matrix equation:

$$V_{t+1} = V_t \cdot [A] \tag{1.1}$$

This can be viewed as a multi-dimensional finite difference equation. The
initial value problem can be solved using generating function, or Z transform,
methods. Sequences of identical Markov transition operators, where
[A] does not vary, are said to be time-homogeneous. A Markov transition
operator is said to be regular if some positive power of that operator
has all positive elements. Time-homogeneous sequences of regular
Markov transition operators always have stable limiting probability vectors,
$\lim_{t \to \infty} (V_t) = \Pi$. The time-varying probability vector reliably converges
to a single point [12, 13, 14, 15].

We can think of the space which contains the time-varying probability
vectors, and the stable limiting probability vector, as a vector space which
has a strong analogy to the state-space which is used in the theory of
control. We shall refer to this space as "state-space," $[0, 1]^N$, and we will refer
to the time-varying probability vector as a state vector. This terminology
is used in the engineering literature [15]. We emphasise that the "state-vector,"
$V_t \in \Re^N$, is distinct from the "state" of the system, $K \in \mathbb{Z}$, used
in Markov chain terminology. As a simple example, we can consider the
regular Markov transition operator

$$[A] = \begin{bmatrix} \frac{13}{16} & \frac{3}{16} \\ \frac{1}{16} & \frac{15}{16} \end{bmatrix} \tag{1.2}$$

using the initial condition

$$V_t = [V_0, V_1] = \left[ \tfrac{3}{4}, \tfrac{1}{4} \right] \quad \text{when } t = 0. \tag{1.3}$$

The components of $V_t$ are $V_0$ and $V_1$ and these can be considered to be
the dimensions of the Cartesian space which we call "state-space". This
space has a clear analogy with the phase-space of Poincaré and the state-space
used in the theory of control. It also has some analogy with the "γ"
or gaseous phase-space of Gibbs and the phase-space used in Lagrangian
dynamics although we must be careful not to press these analogies too
far since the state-spaces of physics and of Markov chains use different
transition operators which obey different conservation laws.

A fundamental question in the study of dynamical systems is to classify
how they behave as $t \to \infty$ and all transient effects have decayed. The
evolution of the state vector of a discrete-time Markov chain generally traces
out a sequence of points or "trajectory" in the state-space. The natural
technique would be to draw a graph of this trajectory. As an example of
this, we can consider the trajectory of the time-homogeneous Markov chain,
described by Equations 1.2 and 1.3, which is shown in Figure 1.1.

FIGURE 1.1. State-space trajectory of a Markov chain (plot title: "Transient response of a Markov chain")

The state vector, $V_t$, always satisfies the constraint $V_0 + V_1 = 1$. This
follows from the law of total probability. The state-vector is always
constrained to lie within an $(N-1)$-dimensional subspace of the $N$-dimensional
state-space. The dynamics of the system all occur within this sub-space.
This is clearly visible in Figure 1.1. We can think of the set

$$M = \{ [V_0, V_1] \mid (0 \le V_0 \le 1) \wedge (0 \le V_1 \le 1) \wedge (V_0 + V_1 = 1) \} \tag{1.4}$$

as a state manifold for the dynamical system defined by Equations 1.2 and
1.3. The state manifold has a dimension which is smaller than the embedding
state-space. This is a result of the fact that there is a conservation
law (the law of total probability) which constrains the dynamics of the
system. For this example, the sequence converges to a stable fixed point at
$\Pi = \left[ \tfrac{1}{4}, \tfrac{3}{4} \right]$. It can be shown that sequences of this type always converge to
single stable fixed points as long as the Markov transition operators are
regular and time-homogeneous [12, 13, 14, 15]. The convergent points are the
appropriate state-space representation of the stable limiting probabilities
for the Markov chain.
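
The convergence described above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the operator of Equation 1.2 has rows [13/16, 3/16] and [1/16, 15/16] and starts from the initial condition [3/4, 1/4]; the helper names `step` and `trajectory` are illustrative.

```python
from fractions import Fraction as F

# Assumed entries of the operator in Equation 1.2 (each row sums to one).
A = [[F(13, 16), F(3, 16)],
     [F(1, 16), F(15, 16)]]


def step(v, m):
    """One step of V_{t+1} = V_t . [A] for a row vector v."""
    return [sum(v[i] * m[i][j] for i in range(len(v)))
            for j in range(len(m[0]))]


def trajectory(v0, m, n_steps):
    """Return the state-space trajectory [V_0, V_1, ..., V_n]."""
    points = [v0]
    for _ in range(n_steps):
        points.append(step(points[-1], m))
    return points


# Initial condition assumed from Equation 1.3.
points = trajectory([F(3, 4), F(1, 4)], A, 60)

# Every point stays on the state manifold V_0 + V_1 = 1.
on_manifold = all(sum(p) == 1 for p in points)

# After the transient has decayed the state vector is close to the fixed point.
final = [float(c) for c in points[-1]]
```

Plotting the first component of each entry of `points` against the second would give a trajectory of the kind described for Figure 1.1, confined to the line $V_0 + V_1 = 1$.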



1.3 Time-Inhomogeneous Markov Chains 

The existence, uniqueness and dynamical stability of the fixed point are
important parts of the theory of Markov chains but we must be careful
not to apply these theorems to systems where the basic premises are not
satisfied. If the Markov transition operators are not homogeneous in time
then there may no longer be a single fixed point in state-space. The state vector
can perpetually move through two or more points without ever converging
to any single stable value. To demonstrate this important point, we present
a simple example, using two regular Markov transition operators:

$$[S] = \begin{bmatrix} \frac{3}{4} & \frac{1}{4} \\ \frac{3}{4} & \frac{1}{4} \end{bmatrix} \tag{1.5}$$

and

$$[T] = \begin{bmatrix} \frac{1}{4} & \frac{3}{4} \\ \frac{1}{4} & \frac{3}{4} \end{bmatrix} \tag{1.6}$$



The rows of these matrices are all identical. This indicates that the
outcome of each game is completely independent of the initial state. The
limiting stable probabilities for these regular Markov transition operators are
$\Pi_S = \left[ \tfrac{3}{4}, \tfrac{1}{4} \right]$ and $\Pi_T = \left[ \tfrac{1}{4}, \tfrac{3}{4} \right]$ respectively. The time-varying probability
vector immediately moves to the stable limiting value after even a single
play of each game:

$$[Q] \cdot [S] = [S] \tag{1.7}$$

and

$$[Q] \cdot [T] = [T] \tag{1.8}$$

for any conformable stochastic matrix [Q]. This leads to some interesting
corollaries:

$$[T] \cdot [S] = [S] \tag{1.9}$$

and

$$[S] \cdot [T] = [T] \tag{1.10}$$



If we play an indefinite alternating sequence of these games, $\{STST\cdots\}$,
then there are two simple ways in which we can associatively group the
terms:

$$V_{2N} = V_0 \, ([S][T])([S][T]) \cdots ([S][T]) \tag{1.11}$$
$$\phantom{V_{2N}} = V_0 \, [T] \tag{1.12}$$
$$\Rightarrow \; \Pi = \Pi_T \tag{1.13}$$

and

$$V_{2N+1} = (V_0 [S])([T][S])([T][S]) \cdots ([T][S]) \tag{1.14}$$
$$\phantom{V_{2N+1}} = V_0 \, [S] \tag{1.15}$$
$$\Rightarrow \; \Pi = \Pi_S. \tag{1.16}$$

If we assume that there is a unique probability limit then we must
conclude that $\Pi_S = \Pi_T$ and hence $\tfrac{3}{4} = \tfrac{1}{4}$, which is a contradiction. We can
invoke the principle of excluded middle (reductio ad absurdum) to
conclude that the assumption of a single limiting stable value for $\lim_{t \to \infty} (V_t)$
is false. In the asymptotic limit as $t \to \infty$, the state vector alternately
assumes one of the two values $\Pi_S$ or $\Pi_T$. We refer to the set of all recurring
state vectors of this type, $\{\Pi_S, \Pi_T\}$, as the attractor of the system. In
more general terms an attractor is a set of points in the state-space which
is invariant under the system dynamics in the asymptotic limit as $t \to \infty$.
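
The two-point attractor can be illustrated with a short sketch. The code below assumes that both rows of [S] equal [3/4, 1/4] and both rows of [T] equal [1/4, 3/4], and uses an arbitrary initial vector; the names `step`, `visited` and `attractor` are illustrative only.

```python
from fractions import Fraction as F

# Assumed operators with identical rows (Equations 1.5 and 1.6).
S = [[F(3, 4), F(1, 4)], [F(3, 4), F(1, 4)]]
T = [[F(1, 4), F(3, 4)], [F(1, 4), F(3, 4)]]


def step(v, m):
    """Row vector times matrix: V_{t+1} = V_t . [M]."""
    return [sum(v[i] * m[i][j] for i in range(len(v)))
            for j in range(len(m[0]))]


v = [F(1, 2), F(1, 2)]          # arbitrary initial condition V_0
visited = []
for t in range(8):
    # Alternating sequence S, T, S, T, ...
    v = step(v, S if t % 2 == 0 else T)
    visited.append(tuple(v))

# The set of recurring state vectors never shrinks to a single point.
attractor = set(visited)
```

Under these assumptions `attractor` contains exactly the two limiting vectors of the individual games, and the state vector jumps between them on every play.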



1.3.1 Reduction of the periodic case to a Time-Homogeneous 
Markov Chain 

In the last section, we considered a short sequence of length 2. This can
be generalised to an arbitrary length, $N \in \mathbb{Z}$. It is possible to associatively
group the operators into sub-sequences of length $N$. As with the sequences
of length two, the choice of time origin is not unique. We are free to make
an arbitrary choice of time origin with the initial condition at t = 0. We can
think of the operators as having an offset of $n \in \mathbb{Z}$, where $0 \le n \le N - 1$,
within the sub-sequence. We can also calculate a new equivalent operator to
represent the entire sequence, $A_{eq} = \prod_{n=0}^{N-1} A_n$. We can then calculate the
steady-state probabilities associated with this operator, $\Pi_{eq} = \Pi_{eq} \cdot A_{eq}$.
We can refer the asymptotic trajectory of the time-varying probability
vector to this fixed point, $V_{(t \bmod N)} = \Pi_{eq} \cdot \prod_{n=0}^{(t \bmod N)} A_n$. In the
periodic case, there is generally not a single fixed point in the original
state-space but the time-varying probability vector settles into a stable
limit cycle of length $N$. If we aggregate time, modulo $N$, then we can
re-define what we mean by "state" and we can define a new state-space in
which the time-varying vector does converge to a single point.

If we allow the length of the period, $N$, to become indefinitely long,
$N \to \infty$, then our new definition of "state" becomes infinitely complicated.
We would have to contemplate indefinitely large offsets, $n \to \infty$, within the
infinitely long cycle. If we wish to avoid the many paradoxes that infinity
can conceal then we really should consider the case with "infinite" period
as being qualitatively different from the case with finite period, $N$.
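
As a hedged illustration of the reduction, the sketch below treats a period-two sequence of two regular 2×2 operators. The matrix entries are illustrative assumptions (rows [5/6, 1/6], [1/2, 1/2] and [1/2, 1/2], [1/6, 5/6]), and the closed form used in `stationary_2x2` is the standard stationary vector of a two-state chain rather than anything specific to this chapter.

```python
from fractions import Fraction as F

# Two illustrative regular operators, played in the periodic order S, T, S, T, ...
S = [[F(5, 6), F(1, 6)], [F(1, 2), F(1, 2)]]
T = [[F(1, 2), F(1, 2)], [F(1, 6), F(5, 6)]]


def matmul(a, b):
    """Ordinary matrix product a . b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def step(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v)))
            for j in range(len(m[0]))]


def stationary_2x2(m):
    """Fixed point pi = pi . m of a regular 2x2 stochastic matrix."""
    p, q = m[0][1], m[1][0]      # off-diagonal transition probabilities
    return [q / (p + q), p / (p + q)]


ops = [S, T]

# Equivalent operator for one whole period: A_eq = A_0 . A_1 . ... . A_{N-1}.
A_eq = ops[0]
for m in ops[1:]:
    A_eq = matmul(A_eq, m)

# Fixed point of the aggregated (time modulo N) process.
pi_eq = stationary_2x2(A_eq)

# The limit cycle in the original state-space, one point per offset n.
limit_cycle = [pi_eq]
for m in ops[:-1]:
    limit_cycle.append(step(limit_cycle[-1], m))
```

With these assumed matrices the aggregated process has a single fixed point while the original state vector keeps cycling through the two entries of `limit_cycle`.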
1.3.2 Random Selection of Markov transition operators

1.4 Two simple Markov Games that Generate a 
Simple Fractal in State-Space 

We proceed to construct a simple system in which operators are selected
at random and we will use the standard theories regarding probability and
expected values to derive some useful results. If we modify the system
specified by Equations 1.5 and 1.6:

$$[S] = \begin{bmatrix} \frac{5}{6} & \frac{1}{6} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} \tag{1.17}$$

and

$$[T] = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{6} & \frac{5}{6} \end{bmatrix} \tag{1.18}$$

and select the sequence of transition operators at random then the attractor
becomes an infinite set. If we were to play a homogeneous sequence of either
of these games then they would have the same stable limiting probabilities
as before, $\Pi_S$ and $\Pi_T$, and the dynamics would be similar to those shown
in Figure 1.1. In contrast, if we play an indefinite random sequence of the
new games S and T, $\{STSSTSTTSTT\cdots\}$, then there are no longer any
stable limiting probabilities and the attractor has a fractal or "self-similar"
appearance which is shown in Figure 1.2.



1.4.1 The Cantor Middle-Third Fractal 

These games have been constructed in such a way that they generate the 
Cantor middle-third fractal. 

It should be noted that the Cantor Middle-Third fractal is an uncountable
set and so a countably infinite random sequence of operators will never
generate enough points to cover the entire set. The solution to this problem
is to consider the uncountably infinite set generated by all possible infinite
random sequences of operators. We can construct a probability measure
on the resulting set and then we can calculate probabilities and expected
values. It is also reasonable to talk about the probability density function
of the time-varying probability vector in the state-space.

FIGURE 1.2. A fractal attractor generated by games S and T

In order to stimulate intuition, we can simulate the process and generate
a histogram, showing the distribution of the time-varying probability
vector. The result is shown in Figure 1.3. For the x axis in this figure, we
could have chosen the first element of the time-varying probability vector,
$V_0$, but this would not have been the easiest way to analyse the dynamics.
It is better if we choose another parameterization. If we examine the
eigenvectors of the matrices in Equations 1.17 and 1.18 then we find that
a better re-parameterization is:

$$x = V_0 - V_1 \tag{1.19}$$

and

$$y = V_0 + V_1 \tag{1.20}$$

Of course, we always have $y = 1$ and $x$ is a new variable in the range
$-1 \le x \le +1$. The Cantor fractal lies in the unit interval $-\tfrac{1}{2} \le x \le \tfrac{1}{2}$,
which is the x interval shown in Figure 1.3. The transformation for matrix
[S], in Equation 1.17, reduces to:

$$x_{t+1} = \tfrac{1}{3} x_t + \tfrac{1}{3} \tag{1.21}$$

and the transformation for matrix [T], in Equation 1.18, reduces to:

$$x_{t+1} = \tfrac{1}{3} x_t - \tfrac{1}{3} \tag{1.22}$$

The transformation S has a fixed point at $x = +\tfrac{1}{2}$ and the transformation
T has a fixed point at $x = -\tfrac{1}{2}$. If we choose these transformations at
random then the recurrent values of x lie in the interval between the fixed
points, $-\tfrac{1}{2} \le x \le \tfrac{1}{2}$. This is precisely the iterated function system for the
Cantor Middle-Third Fractal. Such systems are described in Barnsley [16].
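
The reduction of the two matrix operators to affine maps on $x$ can be checked with a small simulation. This sketch assumes the matrix entries [5/6, 1/6; 1/2, 1/2] for S and [1/2, 1/2; 1/6, 5/6] for T, selects the games with equal probability, and uses a fixed random seed; all helper names are illustrative.

```python
import random
from fractions import Fraction as F

# Assumed entries of the operators in Equations 1.17 and 1.18.
S = [[F(5, 6), F(1, 6)], [F(1, 2), F(1, 2)]]
T = [[F(1, 2), F(1, 2)], [F(1, 6), F(5, 6)]]


def step(v, m):
    """Row vector times 2x2 matrix."""
    return [v[0] * m[0][j] + v[1] * m[1][j] for j in range(2)]


def x_of(v):
    """Re-parameterization x = V_0 - V_1."""
    return v[0] - v[1]


rng = random.Random(0)
v = [F(3, 4), F(1, 4)]           # x = 1/2, the fixed point of S
xs = []
maps_agree = True
for _ in range(40):
    game = rng.choice("ST")
    x_before = x_of(v)
    v = step(v, S if game == "S" else T)
    # Affine form of each operator in the x coordinate.
    expected = (x_before + 1) / 3 if game == "S" else (x_before - 1) / 3
    maps_agree = maps_agree and (x_of(v) == expected)
    xs.append(x_of(v))

# Points generated from the fixed point avoid the removed middle third.
in_unit_interval = all(-F(1, 2) <= x <= F(1, 2) for x in xs)
avoids_middle_third = all(abs(x) >= F(1, 6) for x in xs)
```

Exact rational arithmetic is used so that the comparison between the matrix update and the affine map is an equality rather than a floating-point approximation.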



FIGURE 1.3. A histogram of the distribution of Vt in state-space (plot title: "result of an iterated function system")

The most elementary analysis that we can perform is to calculate the
dimension of this set. If we assume conservation of measure then every
time we perform a transformation, we reduce the diameter by a factor of
$\tfrac{1}{3}$ but the transformed object is geometrically half of the original object so
we can write

$$\frac{1}{2} = \left( \frac{1}{3} \right)^{D} \tag{1.23}$$

where D is the fractional dimension. This is the law of conservation of
measure for this particular system. We can solve this equation for D to get
$D = \log 2 / \log 3 = 0.630929\cdots$.
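
The value of $D$ can be cross-checked by a crude box count. The sketch below is an illustration under stated assumptions: the attractor is approximated by applying every length-8 sequence of the two maps $x \mapsto (x \pm 1)/3$ to $x = 0$, and boxes of width $3^{-5}$ are counted on the interval $[-\tfrac{1}{2}, \tfrac{1}{2}]$. The function names are hypothetical.

```python
import math
from fractions import Fraction as F
from itertools import product


def similarity_dimension(n_copies, ratio):
    """Solve n_copies * ratio**D = 1 for D."""
    return math.log(n_copies) / -math.log(ratio)


# Two copies, each scaled by one third.
D = similarity_dimension(2, 1 / 3)


def attractor_points(depth):
    """Images of x = 0 under every sequence of `depth` maps x -> (x +/- 1)/3."""
    pts = []
    for signs in product((-1, 1), repeat=depth):
        pts.append(sum(F(s, 3 ** k) for k, s in enumerate(signs, start=1)))
    return pts


def box_count(points, k):
    """Number of boxes of width 3**-k on [-1/2, 1/2] that contain a point."""
    width_inv = 3 ** k
    return len({math.floor((x + F(1, 2)) * width_inv) for x in points})


points = attractor_points(8)
boxes = box_count(points, 5)
D_box = math.log(boxes) / math.log(3 ** 5)
```

Each refinement of the box width by a factor of three doubles the number of occupied boxes, which is the same statement as the conservation-of-measure relation solved by `similarity_dimension`.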

We can invert the rules described in Equations 1.21 and 1.22 giving:

$$x_t = 3 x_{t+1} - 1 \tag{1.24}$$

and

$$x_t = 3 x_{t+1} + 1. \tag{1.25}$$

If we consider these equations, together with the law of conservation of total
probability, then we get a self-similarity rule for the PDF (or Probability
Density Function), $p(x)$, of the time-varying probability vector, $V_t$:

$$\tfrac{3}{2} \, p(3x - 1) + \tfrac{3}{2} \, p(3x + 1) = p(x). \tag{1.26}$$

This PDF, $p(x)$, is the density function towards which the histogram in
Figure 1.3 would converge if we could collect enough samples. The
self-similarity rule for the PDF gives rise to a recursion rule for the moment
generating function, $\Phi(\Omega) = E\left( e^{j \Omega x} \right)$:

$$\Phi(\Omega) = \Phi\left( \tfrac{\Omega}{3} \right) \cdot \cos\left( \tfrac{\Omega}{3} \right). \tag{1.27}$$

We can evaluate the derivatives at $\Omega = 0$ and calculate as many of the
moments as we wish. We can calculate the mean, $\mu$, and the variance, $\sigma^2$:

$$\mu = 0 \tag{1.28}$$
$$\sigma^2 = \tfrac{1}{8} \tag{1.29}$$

These algebraic results are consistent with results from numerical
simulations.
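
The first two moments can also be obtained without the generating function, by pushing them through the affine maps directly. This is a sketch under the assumption that each map $x \mapsto (x + s)/3$, with $s = \pm 1$, is chosen with probability one half and independently of $x$; the function name `next_moments` is illustrative.

```python
from fractions import Fraction as F


def next_moments(m1, m2):
    """Map (E[x], E[x^2]) to the moments of x' = (x + s)/3.

    s = +1 or -1 with equal probability and independent of x,
    so E[s] = 0 and E[s^2] = 1.
    """
    new_m1 = m1 / 3              # E[x'] = (E[x] + E[s]) / 3
    new_m2 = (m2 + 1) / 9        # E[x'^2] = (E[x^2] + 2 E[x] E[s] + 1) / 9
    return new_m1, new_m2


# Start from a point mass at x = 1/2 and iterate the moment recursion.
m1, m2 = F(1, 2), F(1, 4)
for _ in range(60):
    m1, m2 = next_moments(m1, m2)

mean = float(m1)
variance = float(m2 - m1 * m1)
```

The recursion has the fixed point $E[x] = 0$, $E[x^2] = \tfrac{1}{8}$, and the iteration above converges to it geometrically from any starting moments.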



1.4.2 Iterated Function Systems

The cause of the fractal geometry is best understood if we realise that
Markov transition operators perform affine transformations on the state-space.
An indefinite sequence of different Markov transition operators is
equivalent to an indefinite sequence of different affine transformations, which
is called an "Iterated Function System". We refer the reader to the work of
Michael Barnsley [16] and the theory of Iterated Function Systems to show
that fractal geometry is quite a general property of a system of randomly
selected affine transformations.



1.5 An Equivalent Representation of the Random 
Selection of Markov Transition Operators 

Consider two mutually exclusive events, $A \cap B = \emptyset$, embedded within some
probability space $(\Omega, \mathcal{F}, P)$. Consider any third event $C \subseteq A \cup B$. These
events are represented in Figure 1.4.

FIGURE 1.4. Set Relationships and Change of Probability

The law of total probability asserts that

$$\Pr(C) = \Pr(C \mid A) \cdot \Pr(A) + \Pr(C \mid B) \cdot \Pr(B). \tag{1.30}$$

We can now make the following particular identifications:

$$C = \{ X \in \Omega \mid (K_{t+1} = j) \wedge (K_t = i) \} \tag{1.31}$$
$$A = \{ \text{played game } A \} \tag{1.32}$$
$$B = \{ \text{played game } B \}. \tag{1.33}$$

If we select games A and B at random with probabilities of $\gamma$ and $(1 - \gamma)$
respectively then we can write $\Pr(A) = \gamma$ and $\Pr(B) = (1 - \gamma)$. By
definition, the Markov matrices for games A and B contain conditional
probabilities for state transitions:

$$A_{i,j} = \Pr\{ (K_{t+1} = j \mid K_t = i) \wedge \text{played game } A \} \tag{1.34}$$
$$B_{i,j} = \Pr\{ (K_{t+1} = j \mid K_t = i) \wedge \text{played game } B \}. \tag{1.35}$$

Note that in this case $C \subseteq A \cup B$. We can define a new operator
corresponding to the events $C_{i,j}$:

$$C_{i,j} = \Pr\{ K_{t+1} = j \mid K_t = i \} \tag{1.36}$$

and Equation 1.30 reduces to

$$C_{i,j} = A_{i,j} \cdot \gamma + B_{i,j} \cdot (1 - \gamma). \tag{1.37}$$

The conditional probabilities of state transitions of the inhomogeneous
Markov process generated by games A and B are the same as the conditional
probabilities of a new equivalent game called "Game C." The transition
matrix for Game C is a linear convex combination of the matrices
for the original basis games, A and B. Even if we have complete access to
the state of the system then there is no function that we can perform on
the state, or state transitions, which could allow us to distinguish between
a homogeneous sequence of Games C and an inhomogeneous random
sequence of Games A and B. We refer to game C as the time-average model.
This is analogous to the state-space averaged model found in the theory of
control [17].
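
The equivalence between random selection and the time-averaged game can be checked for a single step. In this sketch the two games, the mixing probability $\gamma = \tfrac{1}{3}$ and the state vector are all illustrative assumptions; `mix` and `step` are hypothetical helper names.

```python
from fractions import Fraction as F


def mix(a, b, gamma):
    """Convex combination C = gamma * A + (1 - gamma) * B, element by element."""
    return [[gamma * a[i][j] + (1 - gamma) * b[i][j]
             for j in range(len(a[0]))] for i in range(len(a))]


def step(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v)))
            for j in range(len(m[0]))]


# Illustrative games and mixing probability.
A = [[F(5, 6), F(1, 6)], [F(1, 2), F(1, 2)]]
B = [[F(1, 2), F(1, 2)], [F(1, 6), F(5, 6)]]
gamma = F(1, 3)
C = mix(A, B, gamma)

v = [F(2, 5), F(3, 5)]           # arbitrary current state vector

# Law of total probability applied to one random play.
via_A = step(v, A)
via_B = step(v, B)
expected_next = [gamma * a + (1 - gamma) * b for a, b in zip(via_A, via_B)]

# One play of the time-averaged game.
via_C = step(v, C)
rows_sum_to_one = all(sum(row) == 1 for row in C)
```

Because the update is linear in the operator, `expected_next` and `via_C` agree exactly for any choice of `v` and `gamma`.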



1.6 The Phenomenon of Parrondo's Games 

1.6.1 Markov Chains with Rewards 

Suppose that we apply a reward matrix to the process:

$$R_{i,j} = \text{reward if } (K_{t+1} = j) \mid (K_t = i). \tag{1.38}$$

There is a specific reward associated with each specific state transition.
We can think of $R_{i,j}$ as the reward that we earn when a transition occurs
from state i to state j. The state transitions, rewards and probabilities of
transition, for "Game A" are shown in Figure 1.5.

FIGURE 1.5. State Transition Diagram for "Game A" with rewards. (Edge labels recoverable from the diagram: $R_{1,2} = +17$ and $R_{2,1} = +17$.)

The state transition diagrams for "Game B" and the time-averaged "Game C" would have identical
topology and have identical reward structure, although the probabilities of
transition between states would be different. Systems of this type have
been analysed by Howard [18] although we use different, matrix, notation
to perform the necessary multiplications and summations.

The expected reward from each transition of the time-averaged homogeneous
process is:

$$Y_{i,j} = E\left[ R_{i,j} \cdot C_{i,j} \right]. \tag{1.39}$$

If we wish to calculate the mean expected reward then we must sum over
all recurrent states in proportion to their probability of occurrence. This
will be a function of the transition matrix, C, and the relevant steady state
probability vector, $\Pi_C$:

$$Y(C) = \Pi_C \cdot \left( [R] \circ [C] \right) \cdot U^T \tag{1.40}$$

where "$\circ$" represents the Hadamard, or element by element, product and
$U^T$ is a unit column vector of dimension N. Post-multiplication by $U^T$
has the effect of performing the necessary summation. We recall that $\Pi_C$
represents the steady state probability vector for matrix C. The function
$Y(C)$ represents the expected asymptotic return, in units of "reward," per
unit time when the games are played.

If we include the definition of C in Equation 1.37 in Equation 1.39 then
we can write:

$$Y_{i,j} = E\left[ R_{i,j} \cdot \left( \gamma A_{i,j} + (1 - \gamma) B_{i,j} \right) \right] \tag{1.41}$$
$$\phantom{Y_{i,j}} = \gamma E\left[ R_{i,j} \cdot A_{i,j} \right] + (1 - \gamma) E\left[ R_{i,j} \cdot B_{i,j} \right]. \tag{1.42}$$

We can also define:

$$Y(A) = \Pi_A \left( [R] \circ [A] \right) U^T \tag{1.43}$$

and

$$Y(B) = \Pi_B \left( [R] \circ [B] \right) U^T \tag{1.44}$$

and we might falsely conclude that

$$Y(C) = \gamma Y(A) + (1 - \gamma) Y(B). \tag{1.45}$$

This would be equivalent to saying that:

$$Y(C) = \gamma \left( \Pi_A \left( [R] \circ [A] \right) U^T \right) + (1 - \gamma) \left( \Pi_B \left( [R] \circ [B] \right) U^T \right) \tag{1.46}$$

but Equations 1.45 and 1.46 are in error. Equation 1.42 must
be summed over all of the recurrent states of the mixed inhomogeneous
games, whereas in the false Equation 1.46 the first term is summed with
respect to the recurrent states of Game "A" and the second term is summed
with respect to the recurrent states of Game "B." The
dependency on state makes the reward process non-linear. The correct
expression for $Y(C)$ would be:

$$Y(C) = \gamma \left( \Pi_C \left( [R] \circ [A] \right) U^T \right) + (1 - \gamma) \left( \Pi_C \left( [R] \circ [B] \right) U^T \right). \tag{1.47}$$

The difference between the intuitively appealing but false Equations 1.45
and 1.46 and the correct Equation 1.47 is the cause of "Parrondo's paradox."

1.6.2 Parrondo's Paradox Defined

The essence of the problem is that when we say that "Game A is losing"
or "Game B is losing" we perform summation with respect to the steady
state probability vectors for Games "A" and "B" respectively. When we
say that "a random sequence of games A and B is winning," we perform
the summation with respect to the steady state probability vector for the
time-averaged game, Game "C."

We can say that the "paradox" exists whenever we can find two games
A and B and a reward matrix R such that:

$$Y\left( \gamma A + (1 - \gamma) B \right) \neq \gamma Y(A) + (1 - \gamma) Y(B). \tag{1.48}$$

The "paradox" is equivalent to saying that the reward process is not a
linear function of the Markov transition operators.



1.6.3 A simple "Two-State" Example of Parrondo's Games 



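We can show that Parrondo's paradox does exist by constructing a simple
example. We can define

$$[A] = \begin{bmatrix} \frac{5}{6} & \frac{1}{6} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} \tag{1.49}$$

and

$$[B] = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{6} & \frac{5}{6} \end{bmatrix} \tag{1.50}$$

The steady state probability vectors are $\Pi_A = \left[ \tfrac{3}{4}, \tfrac{1}{4} \right]$ and $\Pi_B = \left[ \tfrac{1}{4}, \tfrac{3}{4} \right]$.
These games are the same as games "S" and "T" defined earlier but we
analyse them using the theory of Markov chains with rewards. We can
define a reward matrix

$$[R] = \begin{bmatrix} -7 & +17 \\ +17 & -7 \end{bmatrix} \tag{1.51}$$

and we can apply Equations 1.43, 1.44 and 1.47 to get

$$Y(A) = \left[ \tfrac{3}{4}, \tfrac{1}{4} \right] \left( \begin{bmatrix} -7 & +17 \\ +17 & -7 \end{bmatrix} \circ \begin{bmatrix} \frac{5}{6} & \frac{1}{6} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} \right) \begin{bmatrix} 1 \\ 1 \end{bmatrix} = -1 \tag{1.52}$$

and

$$Y(B) = \left[ \tfrac{1}{4}, \tfrac{3}{4} \right] \left( \begin{bmatrix} -7 & +17 \\ +17 & -7 \end{bmatrix} \circ \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{6} & \frac{5}{6} \end{bmatrix} \right) \begin{bmatrix} 1 \\ 1 \end{bmatrix} = -1 \tag{1.53}$$

and, for the time-average, we get:

$$Y(C) = \left[ \tfrac{1}{2}, \tfrac{1}{2} \right] \left( \begin{bmatrix} -7 & +17 \\ +17 & -7 \end{bmatrix} \circ \begin{bmatrix} \frac{2}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{2}{3} \end{bmatrix} \right) \begin{bmatrix} 1 \\ 1 \end{bmatrix} = +1. \tag{1.54}$$

Games "A" and "B" are losing and the mixed time-average game,
$C = \tfrac{1}{2}(A + B)$, is winning. Equation 1.48 is satisfied and so we have
Parrondo's "paradox" for the two-state games "A" and "B" as defined in
Equations 1.49 and 1.50. We can simulate the dynamics of this two-state version
of Parrondo's games. Some typical sample paths are shown in Figure 1.6.
The results from the simulations are consistent with the algebraic results.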
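
The algebra of this example is small enough to verify exactly. The sketch below assumes the game matrices [5/6, 1/6; 1/2, 1/2] and [1/2, 1/2; 1/6, 5/6], the reward matrix with −7 on the diagonal and +17 off the diagonal, and equal mixing; the closed-form stationary vector for a two-state chain and the helper names are illustrative, not taken from the chapter.

```python
from fractions import Fraction as F


def stationary_2x2(m):
    """Steady state vector pi = pi . m of a regular 2x2 stochastic matrix."""
    p, q = m[0][1], m[1][0]
    return [q / (p + q), p / (p + q)]


def expected_return(pi, r, m):
    """pi . ([R] o [M]) . U^T, the Hadamard product summed against pi."""
    return sum(pi[i] * r[i][j] * m[i][j]
               for i in range(2) for j in range(2))


# Assumed games and rewards of the two-state example.
A = [[F(5, 6), F(1, 6)], [F(1, 2), F(1, 2)]]
B = [[F(1, 2), F(1, 2)], [F(1, 6), F(5, 6)]]
R = [[-7, 17], [17, -7]]

# Time-averaged game for equal mixing, C = (A + B) / 2.
C = [[(A[i][j] + B[i][j]) / 2 for j in range(2)] for i in range(2)]

pi_A, pi_B, pi_C = stationary_2x2(A), stationary_2x2(B), stationary_2x2(C)

Y_A = expected_return(pi_A, R, A)
Y_B = expected_return(pi_B, R, B)
Y_C = expected_return(pi_C, R, C)

# The tempting but incorrect linear combination of the separate returns.
naive = (Y_A + Y_B) / 2

# The correct decomposition sums both terms against the steady state of C.
correct = (expected_return(pi_C, R, A) + expected_return(pi_C, R, B)) / 2
```

The gap between `naive` and `correct` is the whole effect: the rewards are linear in the operators only when they are weighted by the same steady state vector.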



FIGURE 1.6. Simulation of a Two-State version of Parrondo's games (plot title: "Games A, B and mixed (AB)")



If we refer back to Figure 1.5 then an intuitive explanation for this phe- 
nomenon is possible. The negative or "punishing" rewards are associated 
with transitions that do not change state. The good positive rewards are 
associated with the changes of state. If we play a homogeneous sequence of
Games "A" or "B" then there are relatively few changes of state and the
resulting weighted sum of all the rewards is negative. If we play the mixed
game then the rewarding changes of state are much more frequent and the
resulting weighted sum of rewards is positive.

1.7 Consistency between State-Space and Time
averages

In order for the "fractal view" of the process, in state-space, to be consistent
with the time average view of the process we require:

$$E[V_t] = \Pi_C \tag{1.55}$$

The value of $E[V_t]$ follows from the argument in Section 1.4.1. We can
use the mean as defined in Equation 1.28 to state that

$$E[V_t] = \left[ \tfrac{1}{2} + \tfrac{1}{2} E[x], \; \tfrac{1}{2} - \tfrac{1}{2} E[x] \right] \tag{1.56}$$
$$\phantom{E[V_t]} = \left[ \tfrac{1}{2} + \tfrac{1}{2} \cdot 0, \; \tfrac{1}{2} - \tfrac{1}{2} \cdot 0 \right] \tag{1.57}$$
$$\phantom{E[V_t]} = \left[ \tfrac{1}{2}, \tfrac{1}{2} \right]. \tag{1.58}$$

The value of $\Pi_C$ follows from the arguments in Section 1.6.3. Specifically
we require $\Pi_C = \Pi_C \cdot C$, which gives:

$$\Pi_C = \left[ \tfrac{1}{2}, \tfrac{1}{2} \right] \tag{1.59}$$

which is consistent with Equation 1.58 and proves this special case. To
prove the more general case we need to have some notation for an entire
fractal set, like the one shown in Figure 1.2. We use $\{F\}$ to denote the
attractor generated by two operators A and B. We can write:

$$E[\{F\}] = \gamma E[\{F\}] A + (1 - \gamma) E[\{F\}] B \tag{1.60}$$

This follows from conservation of measure under the affine transformations
A and B. We note that everything in these equations is linear and so we
can write

$$E[\{F\}] = E[\{F\}] \left( \gamma A + (1 - \gamma) B \right) \tag{1.61}$$
$$\phantom{E[\{F\}]} = E[\{F\}] \cdot C \tag{1.62}$$

which is the defining property of $\Pi_C$, which implies that

$$E[\{F\}] = \Pi_C. \tag{1.63}$$



The two ways of viewing the situation are consistent which means that 
we can use the time averaged game to calculate expected values of returns 
from Parrondo's games. 
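
The agreement between the ensemble average and the time-averaged game can be checked by brute force for short sequences. This sketch assumes the two-state games with rows [5/6, 1/6], [1/2, 1/2] and [1/2, 1/2], [1/6, 5/6], equal selection probabilities, and an arbitrary starting vector; it enumerates all sequences of eight plays.

```python
from fractions import Fraction as F
from itertools import product

# Assumed two-state games, selected with equal probability.
A = [[F(5, 6), F(1, 6)], [F(1, 2), F(1, 2)]]
B = [[F(1, 2), F(1, 2)], [F(1, 6), F(5, 6)]]
C = [[(A[i][j] + B[i][j]) / 2 for j in range(2)] for i in range(2)]


def step(v, m):
    """Row vector times 2x2 matrix."""
    return [v[0] * m[0][j] + v[1] * m[1][j] for j in range(2)]


def mean_state(v0, n):
    """Average of V_n over all 2**n equally likely sequences of games."""
    total = [F(0), F(0)]
    for seq in product((A, B), repeat=n):
        v = v0
        for m in seq:
            v = step(v, m)
        total = [total[k] + v[k] for k in range(2)]
    return [c / 2 ** n for c in total]


v0 = [F(3, 4), F(1, 4)]
n = 8

# "Casino" view: average over the points reached by every possible sequence.
ensemble = mean_state(v0, n)

# "Customer" view: n plays of the single time-averaged game C.
v = v0
for _ in range(n):
    v = step(v, C)
time_averaged = v
```

The two results are identical at every `n`, and both approach the steady state vector of the time-averaged game as `n` grows.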
1.8 Parrondo's original games



1.8.1 Original Definition of Parrondo's Games 

In their original form, Parrondo's games spanned infinite domains, of all
integers or all non-negative integers [3]. If our interest is to examine the
asymptotic behaviour of the games as $t \to \infty$ and to study asymptotic
rates of return or moments then it is possible to reduce these games by
aggregating states of the Markov chain modulo three. We can do this
without losing any information about the rate of return from the games. After
reduction, the Markov transition operators take the form:

$$[A] = \begin{bmatrix} 0 & a_0 & (1 - a_0) \\ (1 - a_1) & 0 & a_1 \\ a_2 & (1 - a_2) & 0 \end{bmatrix} \tag{1.64}$$

where $a_0$, $a_1$ and $a_2$ are the conditional probabilities of winning, given the
current state modulo three. This form of the games has been published by
Pearce [6].



1.8.2 Optimised form of Parrondo 's Games 

Simulations reveal that periodic inhomogeneous sequences of Parrondo's
games have the strongest Parrondo effect. Further investigation by the
authors, using Genetic Algorithms, suggests that the most powerful form
of the games is a set of three games that are played in a strict periodic
sequence $\{G_0, G_1, G_2, G_0, G_1, G_2, \ldots\}$. The transition probabilities are as
follows:

Game $G_0$ : $[a_0, a_1, a_2] = [\mu, (1 - \mu), (1 - \mu)]$
Game $G_1$ : $[a_0, a_1, a_2] = [(1 - \mu), \mu, (1 - \mu)]$
Game $G_2$ : $[a_0, a_1, a_2] = [(1 - \mu), (1 - \mu), \mu]$

where $\mu$ is a small probability, $0 < \mu < 1$. We can think of $\mu$ as being a
very small, ideally "microscopic", positive number. The rate of return from
any pure sequence of these games is approximately

$$Y \approx \tfrac{3}{2} \mu \tag{1.65}$$

which is close to zero and yet the return from the cyclic combination of
these games is approximately

$$Y \approx 1 - 3 \mu \tag{1.66}$$

which is close to a certain win. We can engineer a situation where we can 
deliver an almost certain win every time using games that, on their own, 
would deliver almost no benefit at all! These games clearly work better as 
a team than on their own. Just as team players may pass the ball in a game 
of soccer, the games {Go, Gi, G2} carefully pass the state vector from one 
trial to the next as this sequence of Parrondo's games unfolds. 
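
These rates can be reproduced from the reduced operators. The following is a minimal sketch under stated assumptions: rewards of +1 for a win and -1 for a loss, an illustrative value `mu = 0.01`, and helper names (`reduced_operator`, `stationary`, `pure_rate`, `cyclic_rate`) that are our own. For the cyclic sequence we find the distribution seen just before each play of the first game, as the stationary vector of the product of the three operators, and then pass it through one full cycle.

```python
import numpy as np


def reduced_operator(a):
    """Mod-3 transition matrix for winning probabilities a = [a0, a1, a2]."""
    a0, a1, a2 = a
    return np.array([[0.0,    a0,     1 - a0],
                     [1 - a1, 0.0,    a1],
                     [a2,     1 - a2, 0.0]])


def stationary(M):
    """Row vector pi with pi @ M = pi and sum(pi) = 1 (least squares)."""
    k = M.shape[0]
    lhs = np.vstack([M.T - np.eye(k), np.ones(k)])
    rhs = np.append(np.zeros(k), 1.0)
    return np.linalg.lstsq(lhs, rhs, rcond=None)[0]


mu = 0.01                                      # illustrative small probability
games = [np.array([mu, 1 - mu, 1 - mu]),       # G0
         np.array([1 - mu, mu, 1 - mu]),       # G1
         np.array([1 - mu, 1 - mu, mu])]       # G2
ops = [reduced_operator(a) for a in games]


def pure_rate(k):
    """Rate of return when game k is played on its own."""
    pi = stationary(ops[k])
    return float(pi @ (2 * games[k] - 1))


def cyclic_rate():
    """Rate of return per game for the periodic sequence G0, G1, G2, ..."""
    v = stationary(ops[0] @ ops[1] @ ops[2])   # state seen just before G0
    total = 0.0
    for a, M in zip(games, ops):
        total += float(v @ (2 * a - 1))        # expected reward of this game
        v = v @ M                              # pass the state vector on
    return total / 3


pure = [pure_rate(k) for k in range(3)]
cyclic = cyclic_rate()
print(pure, cyclic)
```

For this value of `mu` each pure rate is a small positive number of order `mu`, while the cyclic rate is close to 1 - 3 * mu, in line with Equation 1.66.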
1.8.3 An Exquisite Fractal Object

It is possible to de-rate these games by increasing $\mu$. In the limit as $\mu \to \frac{1}{2}$
the Parrondo effect vanishes and the attractor collapses to a single point in
state-space. Just before this limit the attractor takes the form of the very
small and exquisite fractal shown in Figure 1.7.

FIGURE 1.7. A 2D projection of a fractal attractor generated by the
"last gasp" of Parrondo's games

This fractal is embedded in a two dimensional sub-space of the three
dimensional state-space of the games $\{G_0, G_1, G_2\}$. The two dimensional
sub-space has been projected
onto the page in order to make it easier to view. The projection preserves
dot product, length and angle measure. The coordinates "x" and "y" are
linear combinations of the components of the original state vector,
$V_t = [V_0, V_1, V_2]$. The orientation of the image is such that the original
"V2" axis is projected onto the new "y" axis. (The direction of "up" is 
preserved.) The negative numbers on the axes represent negative offsets 
rather than negative probabilities. This is the same concept that is used 
when we write down a probability (1 — p). If p is a valid probability then 
so is (1 — p). The number —p is an offset that just happens to be negative. 
The dimension of this fractal is $D \approx \frac{\log(9)}{\log(4)} \approx 1.585$. We define the amount
of Parrondo effect, $\Delta_p$, as the difference in rate of return, $Y$, between the
mixed sequence of games $\{G_0, G_1, G_2\}$ and the best performance from any
pure sequence of a single game. For this limiting case, $\Delta_p \approx 0$. There are
some interesting qualitative relationships between the Hausdorff dimension
and the amount of Parrondo effect which deserve further investigation to
see if it is possible to state a general quantitative law.
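
The projection described above can be sketched as follows. This is a hedged illustration rather than the procedure used to draw Figure 1.7: the orthonormal basis `E_X`, `E_Y`, the choice of the centre of the simplex as origin, the value `mu = 0.45`, and the assumption that the attractor is generated by applying the three operators in a random order are all our own. The basis is chosen so that the original V2 axis projects onto the new y axis, and so that distances within the plane of valid probability vectors are preserved.

```python
import numpy as np

# Orthonormal basis for the plane V0 + V1 + V2 = 1 (our choice of basis).
E_X = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
E_Y = np.array([-1.0, -1.0, 2.0]) / np.sqrt(6)   # image of the V2 axis
CENTRE = np.full(3, 1 / 3)                        # origin of the offsets


def project(v):
    """Map state vectors (rows of v) to (x, y) offsets within the plane."""
    d = np.atleast_2d(v) - CENTRE
    return np.column_stack([d @ E_X, d @ E_Y])


def reduced_operator(a):
    """Mod-3 transition matrix for winning probabilities a = [a0, a1, a2]."""
    a0, a1, a2 = a
    return np.array([[0.0,    a0,     1 - a0],
                     [1 - a1, 0.0,    a1],
                     [a2,     1 - a2, 0.0]])


def attractor_points(mu, n=2000, burn_in=50, seed=0):
    """Iterate V <- V G_k with k drawn at random (an assumption) and
    collect the visited state vectors after a short burn-in."""
    rng = np.random.default_rng(seed)
    ops = [reduced_operator(np.roll([mu, 1 - mu, 1 - mu], k))
           for k in range(3)]                     # G0, G1, G2
    v = CENTRE.copy()
    points = []
    for t in range(n + burn_in):
        v = v @ ops[rng.integers(3)]
        if t >= burn_in:
            points.append(v)
    return np.array(points)


points = attractor_points(mu=0.45)    # close to the limit mu -> 1/2
xy = project(points)                  # small offsets, some of them negative
# log(9)/log(4) equals log(3)/log(2), about 1.585, the value quoted above.
dimension = np.log(9) / np.log(4)
print(xy.min(axis=0), xy.max(axis=0), dimension)
```

The offsets in `xy` are measured from the centre of the simplex, which is why negative coordinates appear even though every component of the state vector remains a valid probability.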



1.9 Summary 

In this chapter we have analysed Parrondo's games in terms of the theory
of Markov chains with rewards. We have illustrated the concepts
constructively, using a very simple two-state version of Parrondo's games and we
have shown how this gives rise to fractal geometry in the state-space. We 
have arrived at a simple method for calculating the expected value of the 
asymptotic rate of reward from these games and we have shown that this 
can be calculated in terms of an equivalent time-averaged game. We have 
used graphic representations of trajectories and attractors in state-space to 
motivate some of the arguments. 

The use of state-space concepts opens up new lines of enquiry. Simulation 
and visualisation encourage intuition and help us to grasp the essential 
features of a new system. This would be much more difficult if we were 
to use a purely formal algebraic approach at the start. We do not propose 
visualisation as a replacement for rigorous analysis. We see it as a guide to 
help us to decide which problems are worthy of more detailed attention and 
which problems might later yield to a more formal approach. We believe 
that state-space visualisation will be as useful for the study of the dynamics 
of Markov chains as it has already been for the study of other dynamical 
systems. 

Finally, we conclude that Parrondo's games are not really "paradoxical"
in the true sense. The anomaly arises because the reward process is a
non-linear function of the Markov transition operators and our "common
sense" tells us the reward process "ought" to be linear. When we combine
the games by selecting them at random, we perform a linear convex
combination of the operators but the expected asymptotic value of the rewards
from this combined process is not a linear combination of the rewards from
the original games.